Abstract

Adaptive Cluster Sampling (ACS) is introduced as a technique to use when "natural" groupings are 
evident in a spatially distributed population, especially sparsely distributed populations. An ACS 
sampling design will allow efficient allocation of survey manpower with more effective decision rules 
for where/when to allocate those resources. In particular, given a clustered distribution, ACS would 
result in a lower variance of the mean estimator than Simple Random Sampling (SRS). This paper 
derives a linear inequality based on a ratio of variances and the SRS sample size to determine the 
conditions under which ACS would be a more appropriate sampling strategy. This inequality could 
be used to make preliminary decisions on the potential benefits of pursuing an ACS design over 
an SRS design. The relative efficiency of ACS over SRS is discussed in relation to (1)
variable conditions for neighborhood expansion (Conditions to Adapt), (2) the degree of clustering,
and (3) the percent of hits in a particular neighborhood. Simulation results demonstrate how 'rare'
clusters should be for ACS to be a better design than SRS. The relationships that emerge from the 
simulation provide insights that were not apparent in the analysis. The results are important in 
the estimation of thresher shark landings along the California coast. 

Keywords: Adaptive cluster sampling, Simple random sampling, Relative efficiency



1. INTRODUCTION 

Natural resource management applications are rich in situations where distributions of items to 
be sampled occur in patterns that are clustered in space or time. This usually occurs because 
there are inherent associations (e.g., behavioral, environmental) between items or organisms that 
may be unknown to the observer or scientist but which essentially determine the properties of 
that distribution (e.g., means, variances, characteristics of clusters, etc.). Several sampling designs
such as simple random, stratified, systematic, systematic cluster, unequal probability, traditional 
cluster, adaptive cluster etc. are used to estimate these distribution properties. (?) compared 
these techniques applied to widespread aggregation of larval walleye pollock in the Gulf of Alaska. 
In contrast to the preceding, in this paper, we focus on adaptive cluster sampling and attempt to 
compare it to traditional cluster sampling and simple random sampling. A comprehensive review 
of adaptive cluster sampling can be found in (?), (?) and (?). (?) developed an adaptive cluster 
sampling design to estimate the abundance of rockfish populations. Along similar lines, an adaptive 
sampling design was developed for application to survey sampling of catches of thresher sharks in 
a Southern California sport fishery (?). 

In the above application, estimators were derived to estimate the abundance of common thresher
sharks (Alopias vulpinus), a U.S. federally managed species (PFMC, 2004), caught in a sport fishery
primarily off the coast of Southern California. Estimating the characteristics of a sport fishery
poses a different challenge from estimating the same properties for commercial
fisheries, where extensive capture gear is employed. In this particular sport fishery, the California
Department of Fish and Wildlife monitors public boat ramps, docks, and marinas where sport 
vessels return after fishing for thresher sharks. Marinas are located along the coast of California 
from Point Conception to the Oregon coast. Some of these marinas are clustered close to city ar- 
eas. Captures are generally from fishing relatively close to the coastline, usually within state water 
boundaries (i.e., out to three nautical miles). Fishing gear for thresher shark sport fisheries is less 
of an issue than for commercial thresher shark fisheries since sport fisheries are restricted to one 
fishing rod per individual fisherman, each of whom has a license. There may be multiple anglers 
and/or rods on any given vessel trip but the Department of Fish and Wildlife samplers query for this. 



The common thresher shark is a member of the family Alopiidae, distinguished from other sharks
by the presence of a large asymmetric split tail fin that it uses to stun and capture its preferred prey,
consisting primarily of mackerel, sardine, and squid. It is classified as a migratory species and thus
falls under international laws that regulate migratory species (Article 64 of the United Nations
Convention on the Law of the Sea (UNCLOS)). Therefore, there are national and international
responsibilities to monitor the catch of these sharks, bearing in mind that the word catch is a
euphemism for mortality. The sport fishery targets sharks of all age classes, with the predominant
catch being juveniles (pre-reproductive) from the first to the fifth year of their lives. During the
spring to early summer months, older and actively reproducing thresher sharks will aggregate to feed
and, when conditions are favorable, to give birth to pups. There is a demographic relationship
between juveniles and the mature adults, so that over-harvesting of the young ones may lead to an
insufficient number moving into the mature, reproductive category, leading to the possibility of local
depletion and over-exploitation, a phenomenon that occurred in the late 1980s and early 1990s, when
the U.S. commercial drift gillnet fishery for swordfish and common thresher sharks was fished at an
unsustainable level. Conservation and management measures were put in place to address the
sustainability issues, leading to a rebound in the population. These conservation measures focused
solely on commercial fisheries, leaving unchanged the regulations on recreational shark fisheries.
These regulations include a bag limit of 2 sharks per day per angler and no season or size restrictions.

Adaptive Cluster Sampling (ACS) is applied to find an estimate of the number of threshers caught
in particular time intervals over a range of areas (sampling units consisting of public launch ramps
and marinas), in which clustering of boats returning with shark catches has been observed. The
sharks have demonstrated a north-south movement pattern along the coast, responding to cues such
as water temperature and food. An earlier study used a stratified simple random sampling design
of the public launch ramps and marinas and, in some instances, provided estimates with very large
variances. In that study, the California coast was split into strata, north-south along the coast; within
each stratum random sites were chosen for sampling. The sampling proportion per stratum (the number
of randomly chosen sites in a stratum divided by the total number of sites in that stratum) was held
constant across all strata to derive the necessary estimators.



Traditional cluster sampling is an alternative design to stratified simple random sampling to 
estimate catch. In this method, the estimate for the variance of the mean number of boats returning 
with shark catch will require prior knowledge of the number of public ramp and marina clusters 
and consequently the number of public ramps and marinas in each cluster, which would then be 
subsampled to get the total number of boats with shark catches. However, criteria to define clusters
are not available, thus sample designs are not necessarily unique. The characteristics of ACS that
fit these circumstances, and that cluster sampling does not provide, are that (a) it adapts to whether
selected sample units fall within a cluster and (b) it detects a cluster, if one exists, without a priori
knowledge of its existence. Achieving a theoretical comparison of the efficiency of traditional cluster
sampling and ACS seems impractical because of the fundamental difference in implementing the
two designs. In ACS, one does not know a priori the sample size needed to achieve a certain level of
variance, since the sample size evolves as the sampling activity progresses. In contrast, traditional
cluster sampling needs complete knowledge about the number of clusters and, within each cluster,
the primary sampling units, to achieve a certain level of the variance of an estimate.

Appendix A presents estimators from traditional cluster sampling and clarifies why it is im- 
practical to make a theoretical comparison between traditional cluster sampling and ACS. ACS is 
equivalent to SRS if no clusters are found. Thus, comparison of ACS to SRS is straightforward and 
more natural in terms of the mathematics while comparison of ACS to traditional cluster sampling 
would be natural with respect to clusters. In this paper, the authors compare ACS vs SRS and 
include a simulation study that provides insights into the sensitivity and efficiency of ACS relative 
to the degree of clustering. 

The outstanding advantage of adaptive sampling is its suitability for sparsely distributed, clus- 
tered populations. When comparing sampling designs, the efficiency of one design over another is 
measured in terms of relative variances. Another important component of efficiency is measured in 
terms of the cost to carry out the sampling design. The relative efficiency addressed in this paper 
is in terms of variances only, which is the basis for comparing ACS to SRS sampling designs. 



2. METHODOLOGY 

In an adaptive cluster sampling design, an initial random sample of $n_1$ sample units is selected
from the $N$ units of the population. Each unit in the population may represent a quadrat or some fixed
region with an associated neighborhood. The neighborhood of sample unit $i$ is defined as unit $i$ plus
all the adjacent units, and a threshold criterion determines whether additional samples
should be taken in that neighborhood. Whenever the y-value of unit $i$ in the sample satisfies the
criterion (e.g., $y_i > C$, where $C$ is a specified constant), all units in the neighborhood of $i$ are added
to the sample. This process is iterated for every sample unit that satisfies the threshold criterion,
including the ones that are newly added. A network is defined as a subset of the
sample units within a cluster such that if any sample unit of the network is selected, all other units
of this network will enter into the sample. For rare event sampling, such as the thresher shark
application, a criterion $y_i > 1$ is used for expanding neighborhoods: because the problem involves
sampling for a rare species, even a small criterion is enough to trigger an adaptive sampling
network. The specification of a criterion will vary from one study to another
and is usually done in consultation with people familiar with the system and with the response variable
and its behavior. Just as stratified SRS improves precision over SRS,
ACS with stratification can reduce the variance relative to a standalone ACS design.
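
The expansion rule described above can be sketched in a few lines of Python. This is only an illustration and not the authors' implementation: the function name `acs_sample`, the representation of the population as a rectangular grid of counts, the four-neighbor definition of a neighborhood, and the default criterion `c = 0` are all assumptions made for the example.

```python
from collections import deque

def acs_sample(grid, initial_units, c=0):
    """Return the set of (row, col) units in the final adaptive cluster sample.

    grid          : 2D list of counts y_i (hypothetical population layout)
    initial_units : list of (row, col) units drawn in the initial random sample
    c             : criterion constant C; a unit triggers expansion when y_i > c
    """
    n_rows, n_cols = len(grid), len(grid[0])
    sampled = set(initial_units)
    queue = deque(initial_units)
    while queue:
        r, col = queue.popleft()
        if grid[r][col] <= c:
            continue  # criterion not met: this unit does not trigger expansion
        # Criterion met: add the four adjacent units (assumed neighborhood).
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, col + dc
            if 0 <= nr < n_rows and 0 <= nc < n_cols and (nr, nc) not in sampled:
                sampled.add((nr, nc))
                queue.append((nr, nc))  # newly added units are checked in turn
    return sampled

# Toy population: one small cluster in an otherwise empty 4 x 4 frame.
toy_grid = [
    [0, 0, 0, 0],
    [0, 2, 3, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
final_sample = acs_sample(toy_grid, [(1, 1)])
```

Starting from a single initial unit inside the cluster, the final sample contains the three units of the network plus the zero-count edge units that surround it, which is why the final sample size cannot be fixed in advance.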

3. DEFINITION OF SYMBOLS USED 

The symbols used in this paper are consistent with the symbols used in (?) and (?).

Let $N$ - Number of units in a finite population with a variable of interest $y_i$ in the $i$th unit.

$n_1$ - Number of initial random sample units chosen from the $N$ units in the population for the Adaptive Cluster Sampling (ACS) design.

$K$ - Number of networks in the population.

$y_j$ - Number of observations of the variable of interest in sample unit $j$.

$k(i)$ - Label for the network that includes unit $i$.

$B_k$ - Set of units that are present in the $k$th network.

$m_{k(i)}$ - Number of units in network $B_{k(i)}$.

$w_{k(i)}$ - Average of the y-values of the units in the network that includes unit $i$. (It is easily seen that $w_{k(i)} = \frac{1}{m_{k(i)}} \sum_{j \in B_{k(i)}} y_j$.)

$n$ - Number of sample units for a simple random sample where units are chosen randomly from the $N$ units.

var(acs) - Variance estimator for the estimated mean as a result of adaptive sampling.

var(srs) - Variance estimator for the estimated mean as a result of simple random sampling.

4. DERIVATIONS 
Thompson (1996, p. 151, eq. 5.1) provides an expression for the relative efficiency of a conventional
simple random sampling (SRS) design with sample size $n$ to the adaptive cluster sampling (ACS)
strategy with initial sample size $n_1$, in terms of their respective variances. This is given by:

$$\frac{\mathrm{var(acs)}}{\mathrm{var(srs)}} = \frac{n}{n_1}\,\frac{N - n_1}{N - n}\left[1 - \frac{\sum_{k=1}^{K}\sum_{i \in B_k}\left(y_i - w_{k(i)}\right)^2}{\sum_{i=1}^{N}\left(y_i - \mu\right)^2}\right] \tag{1}$$

where $\mu$ is the population mean.

Thus, adaptive sampling will have a lower variance than simple random sampling if the within-network
variance of the population is sufficiently high and occurs in patches. Moreover, for given
sample sizes $n_1$ and $n$, adaptive cluster sampling will be more efficient if the within-network
variation represents a large proportion of the overall variation, since the larger this proportion, the
smaller the bracketed quantity in equation (1). The adaptive criterion $C$ that defines the conditions
for expansion of a neighborhood plays a critical role here. This criterion governs the partitioning of
networks in the population and is directly related to the characteristics of the highly aggregated
clusters that occur in the population. Equation (2) below describes the conditions under which ACS
will be a "superior" sampling design compared to SRS, i.e., ACS will lead to a lower variance than SRS.
The derivation of this inequality is provided in Appendix B.

$$\left(\frac{1}{n_1} - \frac{1}{n}\right)\sigma^2 < \frac{N - n_1}{N n_1}\,\frac{1}{N - 1}\sum_{k=1}^{K}\sum_{i \in B_k}\left(y_i - w_{k(i)}\right)^2 \tag{2}$$

Immediate observations from Equation (2): 

1. If $n_1 = n$, then the variance of estimators obtained using adaptive sampling is lower
than the variance obtained using simple random sampling whenever there is any within-network
variation. This is because the left-hand side (LHS) becomes 0 and the right-hand side (RHS) is
then a strictly positive quantity.

2. If $n_1 > n$, then the LHS is a negative number and the RHS is non-negative. Thus,
adaptive sampling will perform better than SRS.

3. Adaptive sampling will require a smaller initial sample to obtain the same precision as SRS.
That is, $n_1 < n$.
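
Inequality (2) can be checked numerically on a toy population. The sketch below is illustrative only: the function names and the ten-unit population are hypothetical, and it assumes the variance expressions implied by equation (1), namely var(srs) equal to (N - n)/(N n) times the population variance and var(acs) equal to (N - n_1)/(N n_1) times the between-network sum of squares divided by N - 1.

```python
def variance_components(y, network_of):
    """Return N, the total sum of squares and the within-network sum of squares.

    y          : list of unit values y_i
    network_of : list of network labels k(i), one per unit
    """
    N = len(y)
    mu = sum(y) / N
    members = {}
    for yi, k in zip(y, network_of):
        members.setdefault(k, []).append(yi)
    w = {k: sum(v) / len(v) for k, v in members.items()}  # network means w_k
    sst = sum((yi - mu) ** 2 for yi in y)
    ssw = sum((yi - w[k]) ** 2 for yi, k in zip(y, network_of))
    return N, sst, ssw

def var_srs(y, n):
    """Variance of the SRS sample mean with sample size n (with FPC)."""
    N, sst, _ = variance_components(y, list(range(len(y))))
    return (N - n) / (N * n) * sst / (N - 1)

def var_acs(y, network_of, n1):
    """Variance of the ACS mean estimator with initial sample size n1."""
    N, sst, ssw = variance_components(y, network_of)
    return (N - n1) / (N * n1) * (sst - ssw) / (N - 1)

def inequality_2_holds(y, network_of, n1, n):
    """Evaluate both sides of inequality (2) and compare them."""
    N, sst, ssw = variance_components(y, network_of)
    sigma2 = sst / (N - 1)
    lhs = (1 / n1 - 1 / n) * sigma2
    rhs = (N - n1) / (N * n1) * ssw / (N - 1)
    return lhs < rhs

# Hypothetical population: units 6 and 7 form one network, all others are singletons.
y_toy = [0, 0, 0, 0, 0, 0, 4, 6, 0, 0]
networks_toy = [0, 1, 2, 3, 4, 5, "A", "A", 8, 9]
```

With equal sample sizes (n_1 = n = 3) the inequality holds for this population and var(acs) is below var(srs); with n_1 = 2 and n = 3 it fails, and var(acs) exceeds var(srs), in line with the observations listed above.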

In general, for a complex survey design, we usually need a larger sample size to generate the
same level of precision as SRS. It is only the initial number of samples for adaptive sampling
that is claimed to be lower than those required for SRS. At the end of the study, with expanded
neighborhoods grouped in clusters, the number of sample units sampled is higher. Thus, we may
achieve better precision using adaptive cluster sampling than simple random sampling for a fixed
cost, since traveling from one site to another is not random and depends upon the neighborhood
of the site visited on the previous sample day. That is, even if a larger number of sample units
are traversed, the cost may be lower for the same, or greater, level of precision as SRS. Equation
(A.7) in Appendix B expresses the variance of the population from two different sampling designs,
the one from SRS and the double summation proportional to the ACS variance. The relationship
between these variances and the variance of the means follows. For SRS, it is $\sigma^2/n$, and for ACS, it is
given by equation (4), where $\hat{\mu}$ is the mean estimator for ACS. The sum of squares for the response
variable can be considered to be a sum of within-network and between-network components, similar
to a one-way ANOVA (Thompson 1996), i.e.,

$$\sum_{i=1}^{N}\left(y_i - \mu\right)^2 = \sum_{k=1}^{K}\sum_{i \in B_k}\left(y_i - w_{k(i)}\right)^2 + \sum_{i=1}^{N}\left(w_{k(i)} - \mu\right)^2 \tag{3}$$

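Since the variance of the ACS mean estimator is proportional to the between-network sum of squares,
the variance of the adaptive cluster sampling estimator is found from the above by
multiplying both sides of the equation by $\frac{N - n_1}{n_1 N (N - 1)}$ and rearranging terms, i.e.,

$$\mathrm{Var}(\hat{\mu}) = \frac{N - n_1}{n_1 N (N - 1)}\left[\sum_{i=1}^{N}\left(y_i - \mu\right)^2 - \sum_{k=1}^{K}\sum_{i \in B_k}\left(y_i - w_{k(i)}\right)^2\right] \tag{4}$$

Rearranging equation (4) leads to

$$\frac{N - n_1}{n_1 N (N - 1)}\sum_{k=1}^{K}\sum_{i \in B_k}\left(y_i - w_{k(i)}\right)^2 = \frac{N - n_1}{n_1 N (N - 1)}\sum_{i=1}^{N}\left(y_i - \mu\right)^2 - \mathrm{Var}(\hat{\mu}) \tag{5}$$

Plugging the right hand side of the above into (2), we get

$$\left(\frac{1}{n_1} - \frac{1}{n}\right)\sigma^2 < \frac{N - n_1}{n_1 N (N - 1)}\sum_{i=1}^{N}\left(y_i - \mu\right)^2 - \mathrm{Var}(\hat{\mu}) \tag{6}$$

i.e., substituting the SRS population variance, $\sigma^2 = \frac{1}{N - 1}\sum_{i=1}^{N}(y_i - \mu)^2$,

$$\left(\frac{1}{n_1} - \frac{1}{n}\right)\sigma^2 < \frac{N - n_1}{n_1 N}\,\sigma^2 - \mathrm{Var}(\hat{\mu}) \tag{7}$$

Rearranging and factoring, we get

$$\left(\frac{1}{n} - \frac{1}{N}\right)\sigma^2 > \mathrm{Var}(\hat{\mu}) \tag{8}$$

which can be rearranged to highlight the ratio of variances as:

$$\frac{1}{n} - \frac{1}{N} > \kappa \tag{9}$$

where $\kappa = \mathrm{Var}(\hat{\mu})/\sigma^2$.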



If $N < n$, the inequality cannot be valid. In the event of total enumeration, i.e., when $N = n$,
variances are zero, which is the limiting case of the inequality. Thus, irrespective of the initial
random sample size chosen for adaptive sampling, if (8) and (9) hold, ACS performs better than
SRS. The last option of $n < N$ is more complex and can be expressed in terms of either the $N$, $n$
relationship or the ratio of variances. To further investigate the relationship in (9), multiply both
sides of (9) by $n$. Since $\kappa = \mathrm{Var}(\hat{\mu})/\sigma^2$ and since $\sigma^2/n$ is the variance of the sample mean from SRS
(without the finite population correction (FPC)), (9) can be rewritten as

$$1 > \frac{n}{N} + \frac{\mathrm{Var}(\hat{\mu})}{\sigma^2/n} = \frac{n}{N} + \frac{\mathrm{Var}(\hat{\mu})}{\mathrm{Var}(\bar{y}_n)} \tag{10}$$

i.e.,

$$1 > \frac{n}{N} + \kappa_1 \tag{11}$$

where $\kappa_1 = \mathrm{Var}(\hat{\mu})/\mathrm{Var}(\bar{y}_n) = n\kappa$.

Equation (10) can be rearranged as:

$$N(1 - \kappa_1) > n \tag{12}$$

The linear inequality (12) partitions the space defined by the SRS sample size ($n$) and the ratio
of variances of the means of ACS and SRS, $\kappa_1$. The line (when (12) is an equality) has a vertical
intercept equal to the population size $N$ (total enumeration, $n = N$) and a slope of $-N$. The line
$N - N\kappa_1 = n$ divides the $(\kappa_1, n)$ space into two regions. The region $n > N$ is impossible.
$\kappa_1 = 0$ implies $n = N$ (total enumeration), and $n < N$ with $\kappa_1 < 1$ defines the feasible
region. This region, below the line, is the feasible solution set of all combinations of $n$ and the
ratio of mean variances for which ACS is a superior sampling strategy.
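
As a rough illustration of how inequality (12) might be used for a preliminary decision, the sketch below computes the variance ratio and checks the inequality. The function names and the numerical values are hypothetical and are not taken from the thresher shark application.

```python
def kappa1(var_acs_mean, sigma2, n):
    """Ratio of the ACS mean variance to the SRS mean variance without FPC (sigma^2 / n)."""
    return var_acs_mean / (sigma2 / n)

def acs_preferred(N, n, k1):
    """Inequality (12): ACS is the lower-variance design when N (1 - kappa_1) > n."""
    return N * (1 - k1) > n

# Hypothetical values: N = 400 units, SRS size n = 10, population variance 20.
k1_low = kappa1(1.0, 20.0, 10)    # ACS mean variance of 1.0
k1_high = kappa1(1.96, 20.0, 10)  # ACS mean variance of 1.96
```

For these assumed numbers, inequality (8) gives the bound (1/10 - 1/400) x 20 = 1.95 on the ACS variance, so the first case falls in the feasible region and the second falls just outside it; `acs_preferred` returns the same verdicts.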




5. SIMULATION OF SAMPLING ON SPATIAL DATA 
The theory of spatial distributions makes use of an index, the Variance to Mean Ratio (VMR), to
describe different spatial patterns. The VMR is also known as the index of dispersion, dispersion
index, or coefficient of dispersion; it should not be confused with the coefficient of variation, which
is the ratio of the standard deviation to the mean. The VMR is a normalized measure of the dispersion
of realizations of a probability distribution. For example, standard statistical
models such as Poisson, Binomial etc. are often associated with particular random patterns.
Mathematically, the VMR is defined as the ratio of the variance $\sigma^2$ to the mean $\mu$ of the random variable
associated with a spatial distribution. The Poisson distribution has equal variance and mean, giving
it a VMR = 1. The geometric and negative binomial distributions each have a VMR > 1,
while the binomial distribution has VMR < 1, and the uniform distribution (a constant count in every
unit) has VMR = 0. Each distribution's VMR is associated with a spatial pattern described in terms
of the dispersion of the realization. This relationship between VMR and the spatial distributions is
summarized as follows:

Table 1: Relationship between distributions, VMR and their respective descriptions

Distribution        VMR            Description
Uniform             VMR = 0        not dispersed
Binomial            0 < VMR < 1    under dispersed
Poisson             VMR = 1        random
Negative binomial   VMR > 1        over dispersed
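
The VMR of a set of unit counts is straightforward to compute. The following is a minimal sketch that assumes the counts are held in a plain Python list and uses the sample variance; the function name `vmr` and the example counts are hypothetical.

```python
from statistics import mean, variance

def vmr(counts):
    """Variance to Mean Ratio of a list of unit counts (sample variance / mean)."""
    return variance(counts) / mean(counts)

constant_counts = [2, 2, 2, 2]    # same count in every unit
clustered_counts = [0, 0, 0, 8]   # all observations concentrated in one unit
```

The constant pattern gives a VMR of 0 (not dispersed), while the concentrated pattern gives a VMR well above 1 (over dispersed), matching the first and last rows of Table 1.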



Spatial patterns can be generated from simulations based on the Poisson and negative binomial
distributions: the Poisson yields random patterns, while the negative binomial generates over
dispersed data, which in turn correspond to clustering. The aim is to study relative precision
(defined as the ratio Var(SRS)/Var(ACS)) as data become more clustered. In our
application, let a hit be the return of a vessel to a particular marina with at least one shark capture
on a particular day. Different spatial distributions of hits can be simulated to correspond to spatial
patterns of different marinas with hits. For completely unclustered data, the Variance/Mean ratio
is 1 (Poisson). As this ratio is increased (data simulated from a negative binomial), the simulated
data begin to show clustering. The spatial distribution of recreational fishing boats in the marinas
can assume different forms, which, in this paper, range from random to clustered. This could
represent, e.g., the marinas distributed along the California coastline, where anglers return with
thresher shark catch.

Once data are simulated, both SRS and ACS can be applied to these data. A similar study was
conducted by (?) et al. to estimate plant disease incidence for varying Conditions to Adapt (CA),
where CA is a preset criterion (number of hits) to be satisfied before further samples are taken
in the neighborhood. If the number of hits is below the CA, then the next day's neighborhood
samples are determined by SRS. The conditions to adapt are typical of a rare event, e.g., the number of
thresher shark catches on a particular sample day. It has been shown that ACS is most effective for
rare events where the response variable of interest (here, number of hits) is infrequently observed
(Thompson 1996, pp. 8-9). Therefore, as data become more clustered (indicated by an increasing
V/M ratio), the relative precision can be shown to be higher for a lower condition to adapt (e.g.,
CA=1). It can also be seen that as the hit level increases, relative precision decreases (i.e., if more
boats return with a thresher shark catch per day, the relative precision decreases). This implies that
relative precision is higher for rare, clustered data with a low condition to adapt. To summarize,
the results addressed in (?), which also apply to this shark application, are:

1. Relative precision increases as data become more clustered (indicated by an increasing V/M
ratio).

2. Relative precision is inversely related to Conditions to Adapt (CA), i.e., the lower the CA, 
the higher the relative precision (e.g., if CA=1, the relative precision is the highest). 

3. Relative precision decreases as the hit level increases. That is, if more boats return with a 
thresher catch per day, the relative precision decreases. 

6. ACS VS SRS FOR VARYING SPREAD OF CLUSTERS 

Adaptive cluster sampling (ACS) has been recommended in situations where sparsely distributed 
populations occur in clusters or special groupings. Intuitive examples generally include animal 
populations found in pods (orcas), prides (lions), packs (wolves), and the thresher shark instance 
that has been discussed as part of this paper. In each of these animal examples, small groups of 
animals are likely to avoid other such groups, maintaining large distances between groups.
One can imagine an extreme example, where the centers of two or more clusters are so close relative 
to their spread that the overall population seems more evenly or randomly distributed. In such an 
instance the clusters would almost entirely overlap. For such an extreme example, it would seem 
that simple random sampling (SRS) would be the preferred design; although ACS may still yield 
slightly more precise estimates, the additional cost and complexity incurred by ACS may deter use 
of this scheme. This section begins an examination into whether there is a gradient of relative per- 
formance between ACS and SRS given varying relative distances between clustered subpopulations. 
Data were simulated in R and both sampling schemes executed; population and variance estimates 
for both methods are compared. 

A two-dimensional sampling frame of N = 400 units (20 by 20) was set up, and x and y coordinates
of 5 cluster centers were randomly chosen using a uniform distribution. In order to simulate varying
spread of clusters, a bivariate normal distribution was used to generate 50 data points centered
around each of the 5 cluster centers. Five different spreads of clusters were analyzed: the bivariate
normal standard deviation was set at 2/3 unit, 1 unit, 3/2 units, 2 units, and 3 units respectively. This
scheme resulted in between 200 and 250 data points within the frame, because some of the 250 points
fell outside of the sampling frame and were not considered. Figure 1 shows the data
points generated for each of the 5 standard deviations used.
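
The population-generation step described above can be sketched as follows. The study itself was run in R; this Python version is a hedged illustration only. The function name `simulate_population`, the seed handling, and the use of independent normal draws for the x and y coordinates (a bivariate normal with zero correlation and equal standard deviations) are assumptions.

```python
import random

def simulate_population(sd, n_centers=5, points_per_center=50, size=20, seed=0):
    """Return a size x size grid of counts for clustered points.

    sd   : standard deviation of the normal spread around each cluster center
    seed : seed for the random number generator (for reproducibility)
    """
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    for _ in range(n_centers):
        # Cluster centers are uniform over the sampling frame.
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        for _ in range(points_per_center):
            x, y = rng.gauss(cx, sd), rng.gauss(cy, sd)
            # Points falling outside the frame are not considered.
            if 0 <= x < size and 0 <= y < size:
                grid[int(y)][int(x)] += 1
    return grid

# One population per spread considered in the text.
populations = {sd: simulate_population(sd, seed=1) for sd in (2 / 3, 1, 1.5, 2, 3)}
```

Each grid cell plays the role of one sampling unit, and its count is the y-value of that unit; the total count can be at most 250 and is smaller whenever points fall outside the frame.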

For each of the 5 simulated populations, 10 initial random sample units were chosen, and this was
repeated 100 times. For the simple random sampling scheme, counts were taken for these 10 samples
and estimates calculated. For the adaptive cluster sampling scheme, the 10 initial samples were used
to build an ACS sample, where a count greater than 0 in a unit ($y_i > C$ with $C = 0$) triggered
sampling of the adjacent 4 units, as outlined in (?).
Calculated estimates for ACS were compared to those found using SRS for each of the 100 sampling
experiments, and the averages are presented in Table 2. The R code for Table 2 and Figure 1 is
in Appendix 3.
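
A repeated-sampling comparison of this kind can be sketched as below. This is not the authors' R code. It assumes that the ACS estimate is the mean of the network means of the initial units (a Hansen-Hurwitz type estimator, consistent with the variance expression in Section 4), that networks are built with the four-neighbor rule and criterion C = 0, and that population totals are estimated as N times the estimated mean. The names `network_means` and `compare_designs` are hypothetical.

```python
import random
from statistics import mean, pvariance

def network_means(grid, c=0):
    """Map every unit (row, col) to the mean count w of the network containing it."""
    n_rows, n_cols = len(grid), len(grid[0])
    w = {}
    for r in range(n_rows):
        for col in range(n_cols):
            if (r, col) in w:
                continue
            if grid[r][col] <= c:
                w[(r, col)] = grid[r][col]  # unit failing the criterion: network of size one
                continue
            # Flood fill over adjacent units that also satisfy the criterion.
            stack, members = [(r, col)], {(r, col)}
            while stack:
                a, b = stack.pop()
                for da, db in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    p = (a + da, b + db)
                    if (0 <= p[0] < n_rows and 0 <= p[1] < n_cols
                            and p not in members and grid[p[0]][p[1]] > c):
                        members.add(p)
                        stack.append(p)
            avg = mean(grid[a][b] for a, b in members)
            for p in members:
                w[p] = avg
    return w

def compare_designs(grid, n1=10, reps=100, seed=1):
    """Return mean and variance of the SRS and ACS total estimates over reps samples."""
    rng = random.Random(seed)
    units = [(r, col) for r in range(len(grid)) for col in range(len(grid[0]))]
    N = len(units)
    w = network_means(grid)
    srs_totals, acs_totals = [], []
    for _ in range(reps):
        initial = rng.sample(units, n1)  # same initial sample for both designs
        srs_totals.append(N * mean(grid[r][col] for r, col in initial))
        acs_totals.append(N * mean(w[u] for u in initial))
    return (mean(srs_totals), pvariance(srs_totals),
            mean(acs_totals), pvariance(acs_totals))
```

Relative precision in the sense used here would then be the ratio of the SRS variance to the ACS variance returned by `compare_designs`; because the initial samples are random, the ratio varies from one run to the next.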

Examination of Figure 1 shows marked differences between clusters simulated using different
variance parameters from a bivariate normal distribution. The first two panels in Figure 1 (SD=2/3
and SD=1) look like what we would expect from clustered animal populations, the last two (SD=2
and SD=3) seem more randomly distributed than clustered, and the middle panel (SD=3/2) is
somewhere in between.

Comparing the relative efficiency estimates in Table 2, ACS's performance
relative to SRS improves with tighter, more widely spaced clusters, as is to be expected
(approximately twice as precise for the most clustered populations). Variance estimates over 100
sampling experiments for the two least clustered populations were only 40-50 percent better using
ACS than SRS; given the additional cost of ACS it might be more effective in these examples to use SRS.
For the in-between population, where the standard deviation of the bivariate normal distribution used
to generate the data was 1.5 sampling units, ACS might be a better design since it performs
about 75 percent better. These results come from a single replicated simulation study;
repeating the same study sometimes yielded only a 20 percent improvement of ACS
over SRS for the least clustered data. The question remains: is such a reduction worth the extra
cost and complexity of adopting ACS? While this paper does not explore cost effectiveness,
the question is well worth additional exploration and should be one of the more important questions
facing individual study designs.
Table 2: ACS and SRS estimates for various simulated populations. Relative precision increases
as clusters become rare and ACS is a better design. Estimates for 10 initial samples, repeated 100
times

sigma   Actual   μ(acs)     μ(acs)     μ(srs)     μ(srs)     Var(acs)   Var(srs)   Rel.
                 estimate   variance   estimate   variance   estimate   estimate   Precision
0.667   249      251.09     37156      242.4      55706      487222     937144     1.923
1       249      253.25     21193      250.8      47803      374391     687979     1.838
1.5     244      250.02     16306      266.8      31521      221193     389780     1.762
2       227      258.43     12509      241.6      15677      188773     272362     1.443
3       209      206.29     6840       206.8      9921       112530     150372     1.4462
[Figure 1, first panel: Simulated populations (Biv. Normal, SD=0.667)]

Figure 1: Spatial distribution of simulated populations for varying degrees of clustering

7. DISCUSSION
Adaptive Cluster Sampling differs from traditional sampling designs because in ACS it is impossible
to pick, a priori, an initial sample size ($n_1$) that achieves a preset variance of the sampling
estimators. Thus, fixing $n_1$ as seen in inequality (2) is not practical. However, use of the ratio of
the variance estimators for each sampling design is useful, since it allows one to preset the efficiency
of ACS relative to SRS in terms of $n$, the SRS sample size, and the desired precision (variance) ratio,
$\kappa_1$ (Eqn. 12).

Because of the well-known inverse relationship between variance and sample size, which translates
to sampling cost, and because an adaptive design is more complicated than conventional designs, ACS
is probably more costly to execute. ACS will perform better (lower variance estimates relative to SRS)
for patchily distributed populations (Thompson, 1996) and would perform less well in a population
where the sample units are not clustered, that is, where sample units are either randomly or
uniformly distributed.

Having addressed ACS relative to variance and sampling design, it is useful to consider the results
in relation to sampling for rare events. How rare is rare has been discussed by
simulating populations and noting that for tighter clusters, ACS shows almost twice the efficiency
of a traditional SRS design. For unclustered populations or when clusters significantly overlap,
ACS still proves to be a better design in terms of variance, but given the additional cost, its relative
merits require further exploration.

Adaptive sampling designs are, in fact, not found in many sampling books and courses, although, in 
at least one case, a whole book is devoted exclusively to adaptive sampling designs (?). Nevertheless, 
in teaching adaptive sampling in other contexts, the second author has often had to provide guidance 
as to when the selection of adaptive cluster sampling designs will improve over cluster or simple 
random sampling designs. Referring to equation (2) was the best that could be done. The authors 
believe the result obtained in Equation (12) and its ramifications is a step forward. Further insight 
follows from the simulation results. 
APPENDIX A. VARIANCE ESTIMATOR FOR THE MEAN IN TRADITIONAL CLUSTER SAMPLING

ACS may be confused with traditional cluster sampling, but they are quite different. They are
similar in that both refer to a spatial configuration consisting of clusters of sample units which
contain the response variable of interest. In this paper, as well as in the primary source (Thompson
1996), ACS is not typically compared to traditional cluster sampling, but rather to SRS. In
typical sampling textbooks, traditional cluster designs are also compared to SRS designs. Thus
SRS becomes the baseline design for comparison in Section 4. This appendix essentially motivates
why traditional cluster sampling cannot be directly compared to ACS. The following symbols are used
in the expressions involving the means and variance estimators of traditional cluster sampling.
Let,

$N$ - Number of clusters in the population.
$M_i$ - Number of sample units in cluster $i$, $i = 1, 2, \ldots, N$.
$Y_{ij}$ - Value of the response variable $Y$ in sample unit $j$ of cluster $i$.

A.1 Estimates at the level of the individual clusters $M_i$

$M_0 = \sum_{i=1}^{N} M_i$ - Total number of sample units in the population.

$\bar{M} = M_0 / N$ - Mean number of sample units per cluster.

$Y_i = \sum_{j=1}^{M_i} Y_{ij}$ - Total value of the response variable $Y$ in cluster $i$.

$\bar{Y}_i = Y_i / M_i$ - Mean value of the response variable $Y$ in the sample units of cluster $i$.

A.2 Population parameters at the level of all clusters in the population $N$

$Y = \sum_{i=1}^{N} Y_i$ - Total value of the response variable $Y$ of all the sample units in the population.

$\bar{Y} = Y / N$ - Mean value of response variable $Y$ per cluster.

$\bar{\bar{Y}} = Y / M_0 = \bar{Y} / \bar{M}$ - Mean value of response variable $Y$ per sample unit.

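A.3 Sampling estimators from a cluster design
With $n$ clusters sampled, the mean of the response variable per sample unit is:

$\hat{\bar{\bar{Y}}} = \frac{\sum_{i=1}^{n} Y_i}{n\bar{M}} = \frac{\hat{\bar{Y}}}{\bar{M}}$, which has sampling variance given by:
$V[\hat{\bar{\bar{Y}}}] = \frac{V[\hat{\bar{Y}}]}{\bar{M}^2}$, where $V[\hat{\bar{Y}}] = \frac{N - n}{N n}\,\frac{1}{N - 1}\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^2$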

From the expressions above, it is clear that deriving estimators (means and variances) for traditional 
cluster sampling requires complete knowledge of the number of clusters, the number of clusters 
sampled, and the number of primary sampling units per cluster. Thus, traditional cluster sampling 
can be compared with SRS since, from a design perspective, the two are analogous. 
In contrast, ACS does not allow the sample size needed to achieve a given variance of the estimate 
of the mean to be determined in advance; rather, the sample size evolves from the sampling activity. 
This paper attempts to circumvent this issue by deriving equation (12), which allows the efficiency 
of ACS and SRS to be compared by eliminating the final ACS sample size from the equation. One 
consequence of the above relationship that has not been addressed in this paper is that the costs of 
the sampling designs differ. In ACS, cost depends on the number of sample units sampled in a network, 
which is difficult to know a priori. In some cases, the ACS design will be logistically less expensive, 
since successive samples in a neighborhood may be spatially close to each other, whereas sample 
units selected by SRS may be spatially distant from each other. This paper is not meant to address 
the issue of cost, and focuses on the efficiency of sampling designs purely from the perspective of 
relative variances. 
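
As a minimal numerical sketch of the cluster estimators in A.3, the R snippet below computes the mean per sample unit and its estimated variance from a sample of clusters. It assumes the common textbook form of the estimator (sample mean of the cluster totals divided by the mean cluster size, with a finite population correction). The function name `cluster_mean_estimate`, its arguments, and the toy numbers are hypothetical and are not taken from the paper.

```r
# Hypothetical helper: estimate the mean per sample unit from a one-stage
# cluster sample, assuming the usual textbook estimator ybar / Mbar.
#   Yi   : totals of the response in the n sampled clusters
#   N    : number of clusters in the population
#   Mbar : mean number of sample units per cluster (M0 / N), assumed known
cluster_mean_estimate <- function(Yi, N, Mbar) {
  n <- length(Yi)
  ybar <- mean(Yi)                                   # mean cluster total
  v_ybar <- (N - n) / (N * n) * sum((Yi - ybar)^2) / (n - 1)
  list(mean = ybar / Mbar, var = v_ybar / Mbar^2)
}

# Toy data (made up): 4 clusters sampled out of N = 20, 5 units per cluster.
est <- cluster_mean_estimate(Yi = c(10, 0, 4, 6), N = 20, Mbar = 5)
print(est)
```

Note that the function needs the number of clusters, the number of clusters sampled, and the cluster sizes before it can be evaluated, which is the information an ACS design does not supply before sampling starts.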
APPENDIX B. INEQUALITY TO DETERMINE WHEN ACS WILL OUTPERFORM SRS 

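If adaptive cluster sampling outperforms SRS, then 

$$\frac{\mathrm{Var}(\mathrm{ACS})}{\mathrm{Var}(\mathrm{SRS})} < 1 \qquad (A.1)$$

This implies from Equation 1, that: 

$$\frac{m}{n_1} \cdot \frac{N - n_1}{N - m} \left[ 1 - \frac{\sum_{k=1}^{K} \sum_{i \in B_k} (y_i - w_{k(i)})^2}{\sum_{i=1}^{N} (y_i - \mu)^2} \right] < 1 \qquad (A.2)$$

i.e. 

$$\left[ 1 - \frac{\sum_{k=1}^{K} \sum_{i \in B_k} (y_i - w_{k(i)})^2}{\sum_{i=1}^{N} (y_i - \mu)^2} \right] < \frac{n_1}{m} \cdot \frac{N - m}{N - n_1} \qquad (A.3)$$

i.e. 

$$1 - \frac{n_1}{m} \cdot \frac{N - m}{N - n_1} < \frac{\sum_{k=1}^{K} \sum_{i \in B_k} (y_i - w_{k(i)})^2}{\sum_{i=1}^{N} (y_i - \mu)^2} \qquad (A.4)$$

i.e. 

$$\frac{mN - n_1 N}{m(N - n_1)} < \frac{\sum_{k=1}^{K} \sum_{i \in B_k} (y_i - w_{k(i)})^2}{\sum_{i=1}^{N} (y_i - \mu)^2} \qquad (A.5)$$

Cross multiplying and dividing both sides by $m n_1 N(N - 1)$, we get 

$$\left( \frac{m - n_1}{m n_1} \right) \frac{\sum_{i=1}^{N} (y_i - \mu)^2}{N - 1} < \left( \frac{N - n_1}{n_1 N} \right) \frac{\sum_{k=1}^{K} \sum_{i \in B_k} (y_i - w_{k(i)})^2}{N - 1} \qquad (A.6)$$

Note that the terms not within the parentheses on the LHS represent the SRS population variance 
with mean $\mu$. Thus, 

$$\left( \frac{m - n_1}{m n_1} \right) \sigma^2 < \left( \frac{N - n_1}{n_1 N} \right) \frac{\sum_{k=1}^{K} \sum_{i \in B_k} (y_i - w_{k(i)})^2}{N - 1} \qquad (A.7)$$

Here $\sigma^2$ is the population variance. The inequality is similar to (?), p. 275. 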
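
The equivalence between the variance ratio (A.1) and the final inequality (A.7) can be checked numerically on a small population. The R sketch below is an illustration under stated assumptions, not part of the paper's analysis: it takes $n_1$ to be the initial ACS sample size and $m$ the SRS sample size, uses the Hansen-Hurwitz-type ACS variance $\frac{N - n_1}{N n_1 (N - 1)} \sum_{i=1}^{N} (w_{k(i)} - \mu)^2$, and works on a made-up population of 12 units. The names `var_acs`, `var_srs`, and `acs_wins` are hypothetical.

```r
# Toy population (made-up values): y holds unit counts, net labels the
# network each unit belongs to (units with y = 0 form networks of size 1).
y   <- c(0, 0, 0, 5, 9, 1, 0, 0, 0, 0, 3, 7)
net <- c(1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 9, 9)
N   <- length(y)
mu  <- mean(y)
w   <- ave(y, net)                      # network mean w_k(i) for every unit
sigma2 <- sum((y - mu)^2) / (N - 1)     # population variance

# Assumed roles: n1 = initial ACS sample size, m = SRS sample size.
var_acs <- function(n1) (N - n1) / (N * n1) * sum((w - mu)^2) / (N - 1)
var_srs <- function(m)  (N - m) / (N * m) * sigma2

# Inequality (A.7): TRUE when ACS is expected to outperform SRS.
acs_wins <- function(n1, m) {
  lhs <- (m - n1) / (m * n1) * sigma2
  rhs <- (N - n1) / (n1 * N) * sum((y - w)^2) / (N - 1)
  lhs < rhs
}

n1 <- 2
for (m in 2:6) {
  cat("m =", m,
      " A.7 holds:", acs_wins(n1, m),
      " Var(ACS) < Var(SRS):", var_acs(n1) < var_srs(m), "\n")
}
```

For every SRS sample size tried, the two logical columns agree, which is what the algebra from (A.1) to (A.7) implies. In this toy population the within-network variation is small relative to the total variation, so ACS only wins when the SRS sample is no larger than the initial ACS sample.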

APPENDIX C. R CODE FOR SIMULATING CLUSTERED POPULATIONS 

library(fMultivar)   # provides rnorm2d()

# Study region: a 20 x 20 grid of unit cells
xmax <- 20
ymax <- 20
N <- xmax * ymax

# Five cluster centers, drawn uniformly inside the region
centers <- cbind(round(runif(5, 1, xmax - 1), 2), round(runif(5, 1, ymax - 1), 2))

# Each cluster: 50 bivariate normal points (SD = 1) shifted to its center
cluster.1 <- rnorm2d(50) * 1 + c(rep(centers[1, 1], 50), rep(centers[1, 2], 50))
cluster.2 <- rnorm2d(50) * 1 + c(rep(centers[2, 1], 50), rep(centers[2, 2], 50))
cluster.3 <- rnorm2d(50) * 1 + c(rep(centers[3, 1], 50), rep(centers[3, 2], 50))
cluster.4 <- rnorm2d(50) * 1 + c(rep(centers[4, 1], 50), rep(centers[4, 2], 50))
cluster.5 <- rnorm2d(50) * 1 + c(rep(centers[5, 1], 50), rep(centers[5, 2], 50))
simul.data.2 <- rbind(cluster.1, cluster.2, cluster.3, cluster.4, cluster.5)

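# Initial sample size
n <- 10

# Sampling frame: one row per grid cell, with the cell's lower-left corner (x, y)
frame <- data.frame(rep(0, N))
frame$x <- trunc((row(frame) - 1) / 20)
frame$y <- row(frame$x) - 20 * frame$x - 1

# Count the simulated points that fall in each cell
for (i in 1:length(frame[, 1])) {
  frame[i, 1] <- length(which(simul.data.2[, 1] >= frame$x[i] &
                              simul.data.2[, 1] < frame$x[i] + 1 &
                              simul.data.2[, 2] >= frame$y[i] &
                              simul.data.2[, 2] < frame$y[i] + 1))
}
sum(frame[, 1])   # total number of points that fall inside the grid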

# Storage for the estimates from each of the 100 replicates
mu.acs <- rep(0, 100)
var.acs <- rep(0, 100)
mu.srs <- rep(0, 100)
var.srs <- rep(0, 100)

for (h in 1:100) {
  # Initial simple random sample of n cells and their (x, y) grid indices
  rand.samp <- sample(N, n)
  samp.frame <- matrix(1:N, xmax, ymax)
  rand.ind <- arrayInd(which(samp.frame %in% rand.samp), dim(samp.frame))
  colnames(rand.ind) <- c("x", "y")
  names(frame) <- c("count", "x", "y")
  srs.2 <- frame$count[rand.samp]

  # Point counts per cell, stored as an xmax x ymax matrix
  count <- matrix(0, xmax, ymax)
  for (i in 1:xmax) {
    for (j in 1:ymax) {
      count[i, j] <- length(which(simul.data.2[, 1] >= i - 1 & simul.data.2[, 1] < i &
                                  simul.data.2[, 2] >= j - 1 & simul.data.2[, 2] < j))
    }
  }
  sum(count)

  # ACS: grow a network around each initially sampled cell
  network <- array(0, c(xmax, ymax, n))   # cells in the network of initial unit i
  edge <- array(0, c(xmax, ymax, n))      # empty cells bordering that network
  y <- rep(NA, n)                         # total count in each network
  m <- rep(NA, n)                         # number of cells in each network
  newnet <- array(0, c(xmax, ymax, n))
  look <- matrix(0, 4, 2)                 # the four neighbors of a cell
  for (i in 1:n) {
    newnet[rand.ind[i, 1], rand.ind[i, 2], i] <- 1
    # Repeat until a pass adds no new cells to the network
    while (!identical(newnet[, , i], network[, , i])) {
      network[, , i] <- newnet[, , i]
      net.ind <- arrayInd(which(network[, , i] == 1), dim(network[, , i]))
      # Adapt only if the initial cell is non-empty (condition to adapt: count > 0)
      if (count[rand.ind[i, 1], rand.ind[i, 2]] > 0) {
        for (j in 1:length(which(network[, , i] == 1))) {
          look[1, ] <- c(net.ind[j, 1] - 1, net.ind[j, 2])
          look[2, ] <- c(net.ind[j, 1] + 1, net.ind[j, 2])
          look[3, ] <- c(net.ind[j, 1], net.ind[j, 2] - 1)
          look[4, ] <- c(net.ind[j, 1], net.ind[j, 2] + 1)
          for (k in 1:4) {
            # Skip neighbors that fall outside the grid
            if (0 < look[k, 1] & look[k, 1] <= xmax & 0 < look[k, 2] & look[k, 2] <= ymax) {
              if (count[look[k, 1], look[k, 2]] == 0) {edge[look[k, 1], look[k, 2], i] <- 1}
              if (count[look[k, 1], look[k, 2]] > 0) {newnet[look[k, 1], look[k, 2], i] <- 1}
            }
          }
        }
      }
    }
    y[i] <- sum(count[which(network[, , i] == 1)])
    m[i] <- sum(network[, , i])
  }

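  # ACS estimate: mean of the network means (Hansen-Hurwitz form)
  w <- y / m
  mu.acs[h] <- sum(w) / n
  var.acs[h] <- (1 - n / N) * sum((w - mu.acs[h])^2) / (n * (n - 1))

  # SRS estimate from the initial sample alone
  y.srs <- count[rand.ind]
  mu.srs[h] <- sum(y.srs) / n
  var.srs[h] <- (1 - n / N) * sum((y.srs - mu.srs[h])^2) / (n * (n - 1))
}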

# Scale per-cell means and variances up to population totals
Mu.acs <- N * mu.acs
Var.acs <- N^2 * var.acs
Mu.srs <- N * mu.srs
Var.srs <- N^2 * var.srs

# Average and spread of each estimator over the 100 replicates
Mu.acs.avg <- sum(Mu.acs) / 100
Mu.acs.spread <- sum((Mu.acs - Mu.acs.avg)^2) / 100
Var.acs.avg <- sum(Var.acs) / 100
Var.acs.spread <- sum((Var.acs - Var.acs.avg)^2) / 100
Mu.srs.avg <- sum(Mu.srs) / 100
Mu.srs.spread <- sum((Mu.srs - Mu.srs.avg)^2) / 100
Var.srs.avg <- sum(Var.srs) / 100
Var.srs.spread <- sum((Var.srs - Var.srs.avg)^2) / 100

# Plot the five simulated clusters, one color per cluster
plot(cluster.1, xlim = c(0, 20), ylim = c(0, 20), xlab = "Spatial clusters (x)",
     ylab = "Spatial clusters (y)", main = "Simulated populations (Biv. Normal, SD=1)")
points(cluster.2, col = 2)
points(cluster.3, col = 3)
points(cluster.4, col = 4)
points(cluster.5, col = 5)
sum(frame[, 1])
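
The network-growing step of the simulation can also be sketched as a small standalone function, which makes it easier to check on a toy grid. This is a hedged illustration rather than the code used for the paper's results: the function name `grow_network`, the queue-based search, and the 4 x 4 example grid are assumptions introduced here. The condition to adapt is taken to be a cell count greater than zero with four-neighbor neighborhoods, as in the listing above.

```r
# Hypothetical helper: return a logical matrix marking the network that
# contains cell (r0, c0), i.e. all non-empty cells connected to it through
# up/down/left/right neighbors. An empty start cell is a network of size 1.
grow_network <- function(count, r0, c0) {
  in_net <- matrix(FALSE, nrow(count), ncol(count))
  in_net[r0, c0] <- TRUE
  if (count[r0, c0] == 0) return(in_net)
  queue <- list(c(r0, c0))
  steps <- rbind(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))
  while (length(queue) > 0) {
    cell <- queue[[1]]
    queue <- queue[-1]
    for (k in 1:4) {
      nb <- cell + steps[k, ]
      inside <- nb[1] >= 1 && nb[1] <= nrow(count) &&
                nb[2] >= 1 && nb[2] <= ncol(count)
      # Add a neighbor only if it is on the grid, non-empty and not yet visited
      if (inside && count[nb[1], nb[2]] > 0 && !in_net[nb[1], nb[2]]) {
        in_net[nb[1], nb[2]] <- TRUE
        queue[[length(queue) + 1]] <- nb
      }
    }
  }
  in_net
}

# Toy 4 x 4 grid of counts (made up for illustration)
toy <- matrix(c(0, 0, 0, 0,
                0, 3, 2, 0,
                0, 0, 4, 0,
                1, 0, 0, 0), nrow = 4, byrow = TRUE)
net <- grow_network(toy, 2, 2)
w <- sum(toy[net]) / sum(net)   # network mean used in the ACS estimator
print(net)
print(w)
```

Starting from cell (2, 2), the network picks up the two connected non-empty cells and leaves out the isolated non-empty cell in the bottom-left corner, so the network mean is computed over three cells. Visiting each cell once through a queue avoids rescanning the whole network on every pass, which is the main design difference from the `while` loop in the listing.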